Classification of general audio data for content-based retrieval

Authors

  • Dongge Li
  • Ishwar K. Sethi
  • Nevenka Dimitrova
  • Thomas McGee
Abstract

In this paper, we address the problem of classification of continuous general audio data (GAD) for content-based retrieval, and describe a scheme that is able to classify audio segments into seven categories consisting of silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise. We studied a total of 143 classification features for their discrimination capability. Our study shows that cepstral-based features such as the Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC) provide better classification accuracy compared to temporal and spectral features. To minimize the classification errors near the boundaries of audio segments of different type in general audio data, a segmentation-pooling scheme is also proposed in this work. This scheme yields classification results that are consistent with human perception. Our classification system provides over 90% accuracy at a processing speed dozens of times faster than the playing rate. © 2000 Elsevier Science B.V. All rights reserved.
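The abstract does not spell out how the segmentation-pooling scheme works, so the following is only a minimal sketch of one plausible reading: frame-level labels are pooled inside each detected segment by majority vote, so that isolated errors near a boundary do not change the segment's label. The function name `pool_segment_labels`, the boundary format, and the majority-vote rule are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def pool_segment_labels(
    frame_labels: Sequence[str],
    boundaries: Sequence[int],
) -> List[Tuple[int, int, str]]:
    """Give each segment a single label by majority vote over its frames.

    `boundaries` is a hypothetical format: the start index of every segment
    followed by the final end index, e.g. [0, 4, 9] describes the two
    segments [0, 4) and [4, 9). Each segment is assumed to be non-empty.
    """
    pooled = []
    for start, end in zip(boundaries, boundaries[1:]):
        votes = Counter(frame_labels[start:end])
        # most_common(1) returns the label with the highest count
        label, _ = votes.most_common(1)[0]
        pooled.append((start, end, label))
    return pooled


# Frame-level decisions with two isolated errors (index 2 and index 7)
frames = ["speech", "speech", "music", "speech",
          "music", "music", "music", "speech", "music"]
segments = pool_segment_labels(frames, [0, 4, 9])
print(segments)
```

In this toy input the two stray frame labels are outvoted, so each segment ends up with one consistent label. A real system would obtain both the frame labels and the boundaries from its own classifier and segmenter.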

Similar resources

A New Approach For Classification Of Generic Audio Data

The existing audio retrieval systems fall into one of two categories: single-domain systems that can accept data of only a single type (e.g. speech) or multiple-domain systems that offer content-based retrieval for multiple types of audio data. Since a single-domain system has limited applications, a multiple-domain system will be more useful. However, different types of audio data will have di...

Multifeature Audio Segmentation for Browsing and Annotation

Indexing and content-based retrieval are necessary to handle the large amounts of audio and multimedia data that are becoming available on the web and elsewhere. Since manual indexing using existing audio editors is extremely time consuming, a number of automatic content analysis systems have been proposed. Most of these systems rely on speech recognition techniques to create text indices. On the...

Intelligent Content-Based Audio Classification and Retrieval for Web Applications

Content-based technology has emerged from the development of multimedia signal processing and the wide spread of web applications. In this chapter, we discuss the issues involved in content-based audio classification and retrieval, including spoken document retrieval and music information retrieval. Further, along this direction, we conclude that the emerging audio ontology can be applied in fas...

Content-Based Audio Classification and Retrieval Using SVM Learning

In this paper, a support vector machine (SVM) based method is proposed for content-based audio classification and retrieval. Given a feature set, which in this work is composed of perceptual and cepstral features, optimal class boundaries between classes are learned from training data by using SVMs. Matches are ranked by using distances from boundaries. Experiments are presented to compare var...

Precision

As one of the key methods to extract content semantics and structure from audio, automatic audio classification, especially of speech and music, is valuable for content-based audio retrieval, video summary and retrieval, spoken document retrieval, etc. Because a hidden Markov model (HMM) can model the temporal statistical properties of an audio signal well, a left-right discrete HMM is proposed to c...

Journal title:
  • Pattern Recognition Letters

Volume 22  Issue 

Pages  -

Publication date 2001